The zero-degree calorimeter (ZDC) plays a crucial role in determining the centrality in the
Cooling-Storage-Ring External-target Experiment (CEE) at the Heavy Ion Research Facility in Lanzhou
(HIRFL). A boosted decision tree (BDT) multi-classification algorithm was employed to classify the
centrality of the collision events based on raw features from the ZDC, such as the number of fired
channels and the deposited energy. Data from simulated ²³⁸U + ²³⁸U collisions at 500 MeV/u, generated
by the IQMD event generator and subsequently passed through a GEANT4-based detector simulation,
were used to train and test the BDT model. The results show the high accuracy of the
multi-classification model adopted in the ZDC for centrality determination, which is robust against
variations in several factors of the detector geometry and response. This study demonstrates the
good performance of the CEE-ZDC in determining the centrality in nucleus-nucleus collisions.
I. INTRODUCTION


The primary objective of conducting heavy-ion collisions 
at different beam energies is to investigate strong interaction 
matter and understand the QCD phase diagram. The phase 
diagram provides information on the phase transition and 
critical point of a strongly interacting system, where hadron 
gases exist at lower temperatures and low baryon densities; at 
higher temperatures or densities, the hadronic boundary disappears,
and the formerly confined quarks move freely throughout the system
[1]. The Beam Energy Scan program of RHIC-STAR
aims to approach the possible critical point from the high- 
energy side. However, it is essential to study the phase di- 
agram of the hadron phase and approach the critical point 
from the low-energy side [2-4]. The Cooling-Storage-Ring 
External-target Experiment (CEE) at the Heavy Ion Research 
Facility in Lanzhou (HIRFL), with its advanced spectrome- 
ter, provides significant opportunities for studying phase dia- 
grams at extremely high net baryon density levels with ener- 
gies of several hundred AMeV [5]. 

The zero-degree calorimeter (ZDC), one of the subdetec- 
tors of CEE in the forward rapidity region, is designed to ac- 
curately determine the centrality and reaction plane of colli- 
sion events [6]. Collision events are typically classified into 
centrality classes representing certain fractions of the total re- 
action cross-section corresponding to specific intervals of the 
impact parameter b [7]. The impact parameter b is essen- 
tial for understanding the initial overlap region of the col- 
liding nuclei in heavy-ion collisions; it represents the dis- 
tance between the nuclei centers in the plane transverse to 
the beam axis and determines the size and shape of the resulting
medium. However, the impact parameter b is not directly measurable
in experiments. To estimate centrality experimentally,
raw observables that scale monotonically
with impact parameter can be used for classification accord- 
ing to centrality, for example, the reconstructed tracks with 
central barrel tracking detectors or the deposited energy in 
the forward calorimeters. Accurate centrality determination 
is a baseline for many physical analyses in heavy-ion colli- 
sion experiments [8], particularly when searching for observ- 
ables sensitive to a possible phase transition or critical point 
through analysis of fluctuations and correlations. 

In recent years, Machine Learning (ML) methods have 
gained significant attention for determining the centrality of 
heavy ion collisions [8, 9]. Previous studies treated centrality 
determination as a regression problem on impact parameters 
and utilized combined information from central tracking sys- 
tems and forward calorimeters to train ML models. However, 
to avoid autocorrelation in the physics analysis, this study 
adopts a machine-learning approach that utilizes raw experi- 
mental features from a forward calorimeter to determine cen- 
trality. We report the application of a multi-classification ML 
algorithm based on Boosted Decision Trees (BDT) as a centrality
classifier using only the ZDC in ²³⁸U + ²³⁸U collisions
at 500 MeV/u at the CEE. The ML inputs were generated
using the Isospin-dependent Quantum Molecular Dynamics
(IQMD) generator [10]. In addition, we present the efficiency 
and purity measures related to the centrality determination 
performance of the ZDC with a model application. 


II. CEE-ZDC 


The CEE, which utilizes fixed-target-mode heavy-ion col- 
lisions, is the first large-scale experimental nuclear device op- 
erating in the GeV energy region in China. It is equipped with 
a set of sub-detectors, as shown in Fig. 1(a). The detector 
system comprises a beam monitor, T0 detector [5], time projection
chamber (TPC) [11], inner time-of-flight (iTOF) detector [12],
large superconducting dipole magnet, multiwire
drift chamber (MWDC) [13], external time-of-flight (eTOF)
detector [14], and zero-degree calorimeter (ZDC) [6]. 


The purpose of the ZDC is to detect particle fragments in 
the forward rapidity region following semi-central and pe- 
ripheral collisions, which provides vital information for the 
precise reconstruction of the centrality and reaction plane of 
collision events [6, 15]. The ZDC is centrally positioned 
at the end of the CEE, covering a pseudorapidity range of 
1.8 < η < 4.8. The ZDC utilizes a symmetrical, fan-shaped
layout with eight radial and 24 angular sections and a
maximum radius of 1 m. The detector comprises trapezoidal 
modules equipped with uniform plastic scintillators that are 
coupled with a light guide and connected to photomultiplier 
tubes (PMT) to convert scintillation light into charge signals.
To obtain a comprehensive signal, each module provides two 
charge signals from the two dynodes of each PMT that are 
transmitted to two separate readout channels, resulting in 384 
(24 × 8 × 2) channels for the ZDC.


Fig. 1. (Color online) (a) CEE detector schematic layout. (b) ZDC 
detector layout. 


III. MODEL TRAINING WITH SIMULATED EVENT


The training and test data were generated by simulating ²³⁸U +
²³⁸U collisions at 500 MeV/u using the IQMD generator [10].
The generated particles were then transported through the ap- 
paratus using the GEANT4 package [16]. Determining the 
centrality with only one forward rapidity detector, such as 
ZDC, is challenging even when employing ML algorithms. 
Previous ML-based studies on centrality determination relied 
on information from multiple subsystems within the detec- 
tor, such as tracks reconstructed from central barrel detec- 
tors and deposited energy in forward calorimeters, revealing 
a strong correlation between the centrality class and observ- 
ables. CEE-ZDC is a nontracking detector, and the number of 
spectator nucleons in a nucleus-nucleus collision is expected 
to be proportional to the deposited energy and number of fired 
channels in the ZDC. However, the presence of a beam hole at 
the center of ZDC and the limited detector acceptance result 
in a weak monotonic dependence between the impact param- 
eters and observables, as illustrated in Fig. 2(a), which shows 
the number of fired channels and Fig. 2(b), which shows the 
energy deposited in ZDC. 

Fig. 2. (Color online) (a) The number of fired channels in ZDC as a
function of impact parameter. (b) The deposited energy in ZDC as a
function of impact parameter.


Potential improvements in centrality determination can be
achieved by using data from the ZDC subrings, in conjunction
with the full ZDC, as additional features in the ML task.
Moreover, it may be advantageous to use the energy deposited
in the ZDC ring by ring and the number of fired channels
event by event, and to exploit all inherent correlations between
modules. Fig. 3(a) displays the probability distribution of the
fired ZDC channels in the impact parameter range 7 < b < 10
fm, and Fig. 3(b) shows the probability distribution of the
deposited energy of the ZDC rings in the impact parameter range
0 < b < 3 fm. The complex patterns and nontrivial
decision boundaries among the event centrality classes present
an ideal opportunity for applying ML techniques.

Fig. 3. (Color online) (a) Probability distribution of fired ZDC channels
in impact parameter interval of 7 < b < 10 fm. (b) Probability
distribution of deposited energy of ZDC rings in impact parameter
interval of 0 < b < 3 fm.
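The per-event inputs described above (ring-by-ring energy, total energy, and a fired count) can be sketched as follows. This is a minimal illustration, assuming the module energies are available as an 8 × 24 array (rings × sectors); the function name `build_features`, the array layout, and the firing threshold are hypothetical and not taken from the CEE software. For simplicity the sketch counts fired modules rather than individual readout channels.

```python
from typing import Dict, List

N_RINGS = 8      # radial sections of the ZDC
N_SECTORS = 24   # angular sections of the ZDC


def build_features(energies: List[List[float]], threshold: float = 1.0) -> Dict[str, float]:
    """Build per-event inputs from module energies laid out as [ring][sector].

    `threshold` (same unit as the energies) is a hypothetical firing threshold.
    """
    if len(energies) != N_RINGS or any(len(ring) != N_SECTORS for ring in energies):
        raise ValueError("expected an 8 x 24 array of module energies")

    features: Dict[str, float] = {}
    # Energy deposited in each ring (ring-by-ring feature).
    for i, ring in enumerate(energies):
        features[f"e_ring_{i}"] = sum(ring)
    # Energy deposited in the full ZDC.
    features["e_total"] = sum(features[f"e_ring_{i}"] for i in range(N_RINGS))
    # Number of fired modules, used here as a stand-in for fired channels.
    features["n_fired"] = float(sum(e > threshold for ring in energies for e in ring))
    return features
```

Each event then yields ten numbers (eight ring energies, the total energy, and the fired count) that can be stacked into a feature matrix for training.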


Boosted Decision Trees (BDTs), a family of popular supervised
learning algorithms for classification and regression
problems, are extensively used to analyze data in high-energy
physics experiments. In this study, extreme gradient boosting
(XGBoost), a powerful BDT implementation based on the gradient
boosting method, was adopted to solve the multi-classification
problem of centrality determination. The physical features used as
the inputs for model training are the deposited energy in the
full ZDC and in the ZDC subrings, as well as the number of fired
channels in the ZDC. The simulated data were divided into three
centrality classes based on the impact parameter intervals listed in
Table 1. The samples were divided into training and test
samples of equal size for each centrality class. Hyperparameter
optimization with Optuna, a state-of-the-art framework, was
adopted to shorten the optimization time and achieve the
best performance of the trained models [17].
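A training loop of this kind could look like the sketch below. It assumes the `xgboost` and `optuna` Python packages, NumPy arrays for the inputs, and integer class labels 0, 1, 2. The search ranges, the use of validation accuracy as the figure of merit, and the function names are illustrative assumptions; the hyperparameters and objective actually used by the authors are not given in the text.

```python
def suggest_params(trial):
    """Hypothetical search space; the ranges are illustrative, not the authors'."""
    return {
        "max_depth": trial.suggest_int("max_depth", 2, 8),
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        "n_estimators": trial.suggest_int("n_estimators", 50, 500),
        "subsample": trial.suggest_float("subsample", 0.5, 1.0),
    }


def tune_and_train(X_train, y_train, X_valid, y_valid, n_trials=20):
    """Tune an XGBoost multi-class BDT with Optuna and return the refitted model.

    Inputs are assumed to be NumPy arrays; labels are integers 0, 1, 2.
    """
    # Third-party imports are kept local so the helper above has no dependencies.
    import optuna
    import xgboost as xgb

    def objective(trial):
        model = xgb.XGBClassifier(objective="multi:softprob", **suggest_params(trial))
        model.fit(X_train, y_train)
        # Validation accuracy is an assumed figure of merit.
        return float((model.predict(X_valid) == y_valid).mean())

    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=n_trials)

    # Refit on the training sample with the best hyperparameters found.
    best = xgb.XGBClassifier(objective="multi:softprob", **study.best_params)
    best.fit(X_train, y_train)
    return best, study.best_params
```

In such a setup the validation sample used for tuning would normally be kept separate from the final test sample, so that the quoted performance is not biased by the hyperparameter search.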


TABLE 1. The centrality classes with respect to the impact parame- 
ter b intervals 


Centrality class | b interval (fm)
Central | 0 < b < 3
Semi-Central | 3 < b < 7
Peripheral | 7 < b < 10
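The labelling in Table 1 and the equal-size split per class can be sketched as follows. This is an illustrative reading of the text: the treatment of the shared edges b = 3 fm and b = 7 fm, the interpretation of "equal size" as a half-and-half split within each class, and the function names are assumptions.

```python
import random
from typing import Dict, List, Optional, Sequence, Tuple

# Class boundaries in fm, taken from Table 1.
CLASS_EDGES = [(0.0, 3.0, "Central"), (3.0, 7.0, "Semi-Central"), (7.0, 10.0, "Peripheral")]


def centrality_class(b: float) -> Optional[str]:
    """Return the Table 1 class for impact parameter b (fm), or None if outside.

    Table 1 uses strict inequalities; assigning the shared edges b = 3 and
    b = 7 to the more peripheral class is an assumption made here.
    """
    for low, high, name in CLASS_EDGES:
        if low <= b < high:
            return name
    return None


def split_equal(events: Sequence[float], seed: int = 0) -> Tuple[List[float], List[float]]:
    """Split impact parameters into training/test halves of equal size per class."""
    by_class: Dict[str, List[float]] = {}
    for b in events:
        name = centrality_class(b)
        if name is not None:
            by_class.setdefault(name, []).append(b)

    rng = random.Random(seed)
    train: List[float] = []
    test: List[float] = []
    for members in by_class.values():
        rng.shuffle(members)
        half = len(members) // 2
        train.extend(members[:half])
        test.extend(members[half:2 * half])  # drop one event if the count is odd
    return train, test
```

In a real analysis the split would be applied to full event records rather than to bare impact parameters; the impact parameter is used here only to keep the sketch short.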


IV. PERFORMANCE OF THE ML MODELS 


The machine learning model was applied to both the train- 
ing and test sets to visualize the distributions of the ML output 
scores and to check for consistency between the two sets. For 
classification with three centrality classes, the model generates
three scores $p_i$, each representing the probability that the
event belongs to the corresponding class. By construction, the
probabilities for the centrality classes sum to one ($\sum_i p_i = 1$).
Fig. 4 illustrates the probability distributions of the central
(a) and peripheral (b) classes for both the training and test
sets. For events of the respective true class, the predicted
probability peaks close to unity, whereas the distributions for
events of the other two classes are shifted toward zero.
The probability density functions of the training and test samples
for each centrality class agree well, indicating that the
model did not overfit.

Fig. 4. (Color online) The probability distributions of belonging to
the central class (a) and peripheral class (b) for both the training and
test sets.
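The normalization of the class scores can be illustrated with a softmax over per-class raw scores, which is how multi-class gradient-boosting objectives such as XGBoost's `multi:softprob` produce class probabilities. Whether exactly this objective was used here is an assumption, and the class ordering in `predicted_class` is chosen only for illustration.

```python
import math
from typing import List, Sequence


def softmax(margins: Sequence[float]) -> List[float]:
    """Convert per-class raw scores into probabilities that sum to one."""
    shift = max(margins)  # subtract the maximum for numerical stability
    exps = [math.exp(m - shift) for m in margins]
    total = sum(exps)
    return [e / total for e in exps]


def predicted_class(probs: Sequence[float]) -> int:
    """Index of the most probable class (0: central, 1: semi-central, 2: peripheral)."""
    return max(range(len(probs)), key=lambda i: probs[i])
```

Because the three probabilities are tied together by the normalization, a score close to unity for one class necessarily pushes the other two toward zero, which matches the behaviour described for Fig. 4.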


The Receiver Operating Characteristic (ROC) curve is 
commonly used to evaluate the performance of a classifica- 
tion model by plotting the true-positive rate against the false- 
positive rate for various threshold settings. The area under the 
ROC curve, known as ROC AUC, provides a global measure 
of the model performance, ranging from 0.5 (random classification)
to 1 (perfect classification), independent of the threshold
and class distribution [18]. However, for multi-class
classification, the ROC curve cannot be directly defined. Instead, the
"One-vs-One" approach is used: a ROC AUC is computed for each
pair of classes, and the overall average of these individual
values is taken. The ROC curves and ROC AUC values obtained for
the test set are shown in Fig. 5. The high final ROC AUC value of
approximately 0.96 indicates that the BDT model is highly effective
in determining centrality.

Fig. 5. (Color online) ROC curves and AUCs for the different
"One-vs-One" cases, shown with different line colors.
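One common way to compute a One-vs-One averaged AUC is sketched below. For each pair of classes, only events truly belonging to those two classes are kept, the AUC is evaluated twice (once with each class as the positive one) and the two values are averaged; the final number is the unweighted mean over all pairs. This convention and the function names are assumptions, since the text does not specify the exact averaging used.

```python
from itertools import combinations
from typing import List, Sequence


def binary_auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outranks a negative (ties count 1/2)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def ovo_auc(labels: Sequence[int], probs: Sequence[Sequence[float]], n_classes: int = 3) -> float:
    """Unweighted mean of pairwise AUCs over all class pairs."""
    pair_aucs: List[float] = []
    for a, b in combinations(range(n_classes), 2):
        # Keep only the events whose true class is a or b.
        rows_a = [p for y, p in zip(labels, probs) if y == a]
        rows_b = [p for y, p in zip(labels, probs) if y == b]
        # Score with p_a (a as positive) and with p_b (b as positive), then average.
        auc_a = binary_auc([p[a] for p in rows_a], [p[a] for p in rows_b])
        auc_b = binary_auc([p[b] for p in rows_b], [p[b] for p in rows_a])
        pair_aucs.append(0.5 * (auc_a + auc_b))
    return sum(pair_aucs) / len(pair_aucs)
```

The pairwise comparison is quadratic in the number of events, which is acceptable for a sketch; a rank-based implementation would be preferable for large samples.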


V. EFFICIENCY AND PURITY OF THE CENTRALITY 
CLASSIFICATION 


The performance of the centrality classification model was 
evaluated by calculating its efficiency and purity based on 
ML output scores. Efficiency refers to the fraction of events
truly belonging to a centrality class that are correctly assigned
to it, whereas purity measures the fraction
of events correctly classified for a particular centrality class
out of all the events assigned to that class. The efficiency versus
purity of the multi-classification models for each centrality
class is shown in Fig. 6, where the red, green, and blue
solid lines represent the central, semicentral, and peripheral 
classes, respectively. The peripheral class was the most effec- 
tively classified, and the central class was more challenging 
than the semi-central class in higher-efficiency regions. The 
values listed in Table 2 indicate that even at very high purity
levels, the efficiency of the peripheral class is not significantly 
compromised, and both the central and semi-central classes 
exhibit promising efficiency values at high purity. These re- 
sults indicate that the ML-based event centrality determina- 
tion utilized in ZDC is effective. 


TABLE 2. Efficiency and purity values for different centrality 
classes 


Purity | Central efficiency | Semi-Central efficiency | Peripheral efficiency
90% | 67% | 66% | 97%
95% | 41% | 47% | 94%
98% | 11% | 24% | 93%
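The efficiency and purity of one class at a given working point can be computed as in the sketch below, which selects events whose model probability for the class exceeds a threshold. Selecting on a single-class probability threshold is an assumption about how the working points were defined, and the function name is hypothetical.

```python
from typing import Sequence, Tuple


def efficiency_purity(labels: Sequence[int], scores: Sequence[float],
                      target: int, threshold: float) -> Tuple[float, float]:
    """Efficiency and purity for class `target` when selecting score > threshold.

    `scores` holds the model probability for class `target` for each event.
    """
    selected = [y for y, s in zip(labels, scores) if s > threshold]
    n_true = sum(1 for y in labels if y == target)       # events truly in the class
    n_correct = sum(1 for y in selected if y == target)  # selected and truly in the class
    efficiency = n_correct / n_true if n_true else 0.0
    purity = n_correct / len(selected) if selected else 0.0
    return efficiency, purity
```

Scanning the threshold from 0 to 1 traces out an efficiency-versus-purity curve for the class; raising the threshold generally trades efficiency for purity.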


Fig. 6. (Color online) Efficiency versus purity of the multi-classification
models for each centrality class. The red, green, and
blue lines represent the central, semi-central, and peripheral classes,
respectively.


In addition, to evaluate the performance of the centrality
determination with the ZDC, the effects of several factors
related to the configuration of the ZDC in the simulation data
were systematically investigated. These factors include the 
thickness of the plastic scintillator of the ZDC detector, hit efficiency,
energy resolution, and heavy nuclei with or without
de-excitation (tunable settings in IQMD). The ZDC plastic
scintillator thickness was varied from 1 to 4 cm, and the hit
efficiency was varied from 90% to 95%. The deposited energy
was also smeared using Gaussian distributions with different
widths (sigma). As illustrated in Fig. 7, the red, green, and
blue lines indicate central, semi-central, and peripheral col- 
lisions, respectively. Changes in these factors are depicted 
by distinct line styles. The results indicated that the effects 
of these factors on the purity and efficiency of the centrality 
classification were minor. Among the tested factors, the ZDC 
detector thickness had the most significant impact, although 
its effect was relatively small. In conclusion, this study sug- 
gests that the multi-classification adopted in ZDC is robust 
against variations in these factors, indicating the potential for 
reliable and accurate classification of centrality using ZDC. 
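The hit-efficiency and energy-resolution variations can be emulated on simulated module energies as in the sketch below. Treating the smearing as a Gaussian whose width is proportional to the deposited energy, and applying the hit efficiency independently per module, are assumptions of this illustration; the text does not specify how the variations were implemented.

```python
import random
from typing import List, Sequence


def degrade_response(energies: Sequence[float], hit_efficiency: float,
                     rel_sigma: float, rng: random.Random) -> List[float]:
    """Apply a hit-efficiency drop and Gaussian energy smearing to module energies.

    `rel_sigma` is a relative resolution (sigma / E); treating the smearing as
    proportional to the energy is an assumption of this sketch.
    """
    out: List[float] = []
    for e in energies:
        if e <= 0.0 or rng.random() >= hit_efficiency:
            out.append(0.0)  # no deposit, or the hit is lost
            continue
        smeared = rng.gauss(e, rel_sigma * e)
        out.append(max(smeared, 0.0))  # clip unphysical negative energies
    return out
```

Retraining or re-evaluating the classifier on inputs degraded in this way, for several values of `hit_efficiency` and `rel_sigma`, gives a simple robustness check of the kind described above.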


VI. SUMMARY 


This study aimed to determine the centrality class of 
nucleus-nucleus collisions at the CEE-ZDC detector using 
a multi-classification model based on the XGBoost classi- 